Abstract

We derive a general limit on the fidelity of a quantum channel conveying an 

ensemble of pure states. Unlike previous results, this limit applies to arbitrary 

coding and decoding schemes. This establishes the converse of the quantum 

noiseless coding theorem for all such schemes. 
I. QUANTUM ENCODING AND DECODING



One of the central problems in quantum information theory [1] is the transmission of
pure quantum states from a sender to a receiver using the least possible channel resources.
Suppose Alice generates the state $|a_i\rangle$ of the system Q with probability $p_i$. This is encoded
by some (possibly mixed) state $W_i$ of the channel system C (generally of smaller Hilbert-space
dimension than Q) and delivered to Bob, who performs a decoding operation giving
a state $w_i$ of Q. We assume that no "noise" is present in the system except that introduced
in the coding and decoding processes. Letting $\pi_i = |a_i\rangle\langle a_i|$, this may be represented by

$$\pi_i \rightarrow W_i \rightarrow w_i .$$

The decoded state $w_i$ is not necessarily required to equal $\pi_i$ exactly; it will suffice for Alice
and Bob if the inputs and outputs are sufficiently close to each other. The "closeness" of
the input and output states is measured by the average fidelity $F$:

$$F = \sum_i p_i \, F(\pi_i, w_i) , \qquad (1)$$

where $F(\pi_i, w_i) = \mathrm{Tr}\, \pi_i w_i$ is the probability that $w_i$ will pass a test that checks its identity
against $\pi_i$. Alice and Bob will succeed in their task if $F$ is close to unity, and fail if it is not.
Our problem is to characterize the minimal channel resources, i.e., the minimal dimension of
the support of the states $W_i$, which are necessary and sufficient for high fidelity transmission
[2,3].
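As a minimal numerical sketch of equation (1), the following Python/NumPy snippet evaluates the average fidelity for a toy ensemble. The function name `average_fidelity` and the two example qubit states are illustrative assumptions, not part of the original text.

```python
import numpy as np

def average_fidelity(probs, inputs, outputs):
    """Average fidelity F = sum_i p_i <a_i| w_i |a_i> for pure inputs |a_i>.

    probs   : probabilities p_i
    inputs  : normalized state vectors |a_i>
    outputs : decoded density matrices w_i
    """
    return float(sum(p * np.real(np.vdot(a, w @ a))
                     for p, a, w in zip(probs, inputs, outputs)))

# Hypothetical ensemble: two nonorthogonal qubit states, equally likely.
a0 = np.array([1.0, 0.0])
a1 = np.array([1.0, 1.0]) / np.sqrt(2)
probs = [0.5, 0.5]

# Perfect transmission: every output equals its input projector.
perfect = [np.outer(a, a.conj()) for a in (a0, a1)]
# Useless channel: every output is the maximally mixed state.
mixed = [np.eye(2) / 2] * 2

F_perfect = average_fidelity(probs, [a0, a1], perfect)
F_mixed = average_fidelity(probs, [a0, a1], mixed)
```

In this toy case the perfect channel gives an average fidelity of one, while the maximally mixed output gives one half, so the quantity distinguishes success from failure in the sense described above.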

This process of retrieving faithful copies of the input states from the states of the channel 
has applications in quantum cryptography, where nonorthogonal states represent encrypted 
classical information [4,5], and in problems of efficient information storage and retrieval for
quantum computers [6].

The decoding operation $W_i \rightarrow w_i$ must be accomplished without any "side
information"; that is, the only information possessed by Bob about the input state is his
knowledge of the message ensemble and the coding procedure that prepares the channel C. Bob's
decoding procedure must be a dynamical evolution that is specified apart from the state on
which it acts. On the other hand, we make no such assumption about Alice's encoding
operation, so that the association $\pi_i \rightarrow W_i$ is completely arbitrary. Indeed we generally allow
Alice to have knowledge of the identities of the specific input states and she is therefore able
to effect arbitrary encodings. In contrast, Bob is unable to reliably identify the (generally
nonorthogonal) channel states $W_i$, so his decoding procedure is restricted by the laws
of quantum mechanics as described in Sec. III below.

Note that the encoding procedure here is more general than the scenario in which Alice 
is required to encode the input states without knowledge of their identities (knowing only 
their a priori distribution). In this situation the allowable encodings $\pi_i \rightarrow W_i$ are no longer
arbitrary but subject to restrictions analogous to those on Bob's decoding procedures. (This
is in contrast [8,9] to the corresponding situation with classical signals, which may always be
reliably identified without disturbance.) A remarkable consequence of the quantum noiseless 
coding theorem and its converse described below is that the minimal channel resources for 
high fidelity transmission in this situation are asymptotically the same as those for the case 
where Alice is able to apply arbitrary encoding processes, i.e., knowledge of the identity 
of the input states does not lead to any reduction of channel resources. Indeed, an
explicit encoding scheme has been described which achieves (asymptotically) the minimal channel
resources and this scheme operates without knowledge of the identity of the input states 
(being dependent only on their a priori distribution). 

The quantum noiseless coding theorem proved in [2,3] relates the achievable average
fidelity $F$ to the size of the channel system. This size is given in terms of the number of
two-level systems, or qubits, that comprise the channel when coding is performed on large
blocks of signals drawn identically from the original message ensemble. (Of course, the
description of the channel in terms of qubits is mere convenience: any channel described by
a Hilbert space of dimension $d$ is equivalent for our purposes to $\log d$ qubits.) Suppose we have
input states $\pi_i$ with probabilities $p_i$, as before, and let $\rho = \sum_i p_i \pi_i$ be the density operator
describing the input ensemble. The von Neumann entropy of $\rho$ is given by

$$S(\rho) = -\mathrm{Tr}\, \rho \log \rho , \qquad (2)$$

where the base of the logarithm is 2. Then the quantum noiseless coding theorem states:

Let $\epsilon, \delta > 0$, and suppose $S(\rho) + \delta$ qubits are available in the channel per input
state. Then for all sufficiently large $N$, there exists a coding and a decoding
scheme which transmits blocks of $N$ states with average fidelity $F > 1 - \epsilon$.

In other words, the von Neumann entropy is a measure of the channel resources (in qubits) 
sufficient to transmit quantum states with arbitrarily high average fidelity. A converse to 
the theorem has also been given. 

Let $\epsilon, \delta > 0$, and suppose $S(\rho) - \delta$ qubits are available in the channel per input
state. Then for all sufficiently large $N$, for any coding and decoding scheme for
blocks of $N$ states, the average fidelity satisfies $F < \epsilon$.

This converse states that the von Neumann entropy is a measure of the channel resources 
necessary to transmit quantum states with high average fidelity. 
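To make the resource count concrete, here is a small sketch that evaluates equation (2) for a toy ensemble and converts it into an approximate number of qubits for a block of signals. The helper `von_neumann_entropy`, the example states, and the block length `N = 1000` are assumptions chosen for illustration only.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log2 rho, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # 0 log 0 is taken to be 0
    return float(-np.sum(evals * np.log2(evals)))

# Hypothetical ensemble: |0> and |+> with equal probability.
a0 = np.array([1.0, 0.0])
a1 = np.array([1.0, 1.0]) / np.sqrt(2)
rho = 0.5 * np.outer(a0, a0) + 0.5 * np.outer(a1, a1)

S = von_neumann_entropy(rho)             # strictly below 1 bit for this ensemble
N = 1000
qubits_needed = int(np.ceil(N * S))      # about N * S(rho) qubits for N signals
```

Because the two signal states are nonorthogonal, the entropy is well below one bit, so a block of 1000 such signals needs noticeably fewer than 1000 qubits in the asymptotic sense of the theorem.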

In this formulation, the converse refers to all possible coding/decoding schemes. However,
the proof given in [2] and [3] implicitly assumes that the decoding scheme is unitary; that
is, that the map $W_i \rightarrow w_i$ is a unitary mapping from the channel's Hilbert space into
the Hilbert space of the decoded signals. There are still other possibilities that must be
considered. For example, the decoding scheme might involve a measurement, the discarding
of an entangled subsystem, or any other process allowed within the laws of physics. The
converse of the quantum noiseless coding theorem cannot be established in full generality
without considering all conceivable decoding schemes. Indeed, in the Appendix we present a
simple example, containing all the salient features of this problem, which shows that for particular
(nonoptimal) encodings it is possible for nonunitary decodings to provide higher fidelity
than any unitary decoding scheme. Therefore the issue of real concern for the converse is
whether such nonunitary decoding schemes add any power to optimal encodings.

Our aim in this paper is to complete the general proof of the converse of the quantum 
noiseless coding theorem by establishing a lemma that links the average fidelity F of the 
decoded signal states to the size of the channel system and to properties of the density
operator $\rho$ of the ensemble of input states. This fidelity lemma may also prove useful in
other contexts. 



II. FIDELITY 

Suppose $\rho_1$ and $\rho_2$ are density operators describing states of a quantum system Q. We
can always imagine that these mixed states arise by a partial trace operation from pure
states of an extended system QA. That is, there are states $|1\rangle$ and $|2\rangle$, called "purifications"
of $\rho_1$ and $\rho_2$, for which

$$\rho_1 = \mathrm{Tr}_A\, |1\rangle\langle 1| , \qquad \rho_2 = \mathrm{Tr}_A\, |2\rangle\langle 2| .$$

We define (as in [10]) the fidelity $F(\rho_1, \rho_2)$ by

$$F(\rho_1, \rho_2) = \max |\langle 1|2\rangle|^2 , \qquad (3)$$

where the maximum is taken over all purifications $|1\rangle$ of $\rho_1$ and $|2\rangle$ of $\rho_2$. Thus, the fidelity
is the largest squared inner product between purifications of two density operators. This
definition provides a generalization to mixed states of the natural squared inner product
measure of fidelity for pure states.



Basic properties of this notion of fidelity are described in detail in [10], and we note the
following.

• $0 \le F(\rho_1, \rho_2) \le 1$, and $F(\rho_1, \rho_2) = 1$ if and only if $\rho_1 = \rho_2$.

• $F(\rho_1, \rho_2) = F(\rho_2, \rho_1)$.

• If one of the states $\rho_1$ is a projection $\pi_1$, i.e., a pure state, then we have the more
direct expression

$$F(\pi_1, \rho_2) = \mathrm{Tr}\, \pi_1 \rho_2 .$$

(A general expression for arbitrary mixed states is given in [10] but this is not required
in the present work.)

• In defining the fidelity for mixed states, it is sufficient to fix any one of the purifications
$|1\rangle$ of $\rho_1$ and take the maximum of $|\langle 1|2\rangle|^2$ over arbitrary purifications $|2\rangle$ of $\rho_2$.
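For numerical experiments it is convenient to have a closed form for the mixed-state fidelity. The sketch below assumes the standard Uhlmann expression $F(\rho_1,\rho_2) = \left(\mathrm{Tr}\sqrt{\sqrt{\rho_1}\,\rho_2\sqrt{\rho_1}}\right)^2$, which is not written out in the text above; the helper names `psd_sqrt` and `fidelity` are likewise assumptions. The snippet checks that this expression reduces to $\mathrm{Tr}\,\pi_1\rho_2$ when one argument is pure.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)       # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho1, rho2):
    """Assumed closed form: F = (Tr sqrt( sqrt(rho1) rho2 sqrt(rho1) ))^2."""
    s = psd_sqrt(rho1)
    inner = psd_sqrt(s @ rho2 @ s)
    return float(np.real(np.trace(inner)) ** 2)

# Check the pure-state shortcut F(pi_1, rho_2) = Tr(pi_1 rho_2).
a = np.array([1.0, 1.0]) / np.sqrt(2)
pi1 = np.outer(a, a)
rho2 = np.diag([0.8, 0.2])

F_general = fidelity(pi1, rho2)
F_swapped = fidelity(rho2, pi1)           # symmetry in the two arguments
F_pure = float(np.trace(pi1 @ rho2))
```

All three numbers agree up to round-off, which illustrates the symmetry property and the pure-state expression listed above.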

We can extend the definition of fidelity from normalized states to subnormalized states
(in which $\mathrm{Tr}\, \rho_1 < 1$) in an obvious way, by requiring that the purifications have the same
normalization: $\langle 1|1\rangle = \mathrm{Tr}\, \rho_1$.

We now establish a useful inequality for fidelity. Let $\rho_1$, $\rho_2$, and $\rho_3$ be states, and let
$F_{12} = F(\rho_1, \rho_2)$, etc. We will require that $\mathrm{Tr}\, \rho_3 = 1$, but $\rho_1$ and $\rho_2$ may be subnormalized.
Then

$$F_{13} \le F_{23} + 2\left(1 - \sqrt{F_{12}}\right) + 2\sqrt{2 F_{23} \left(1 - \sqrt{F_{12}}\right)} . \qquad (4)$$

This implies that if $F_{12}$ is close to unity and $F_{23}$ is close to zero, then $F_{13}$ must also be close to
zero.

The proof is not difficult. We construct purifications for our states with these properties: 

• All inner products ($\langle 1|2\rangle$, etc.) are real and nonnegative,

• $F_{12} = \langle 1|2\rangle^2$,

• $F_{13} = \langle 1|3\rangle^2$.

This can be done by the following procedure. We fix $|1\rangle$ and choose $|2\rangle$ and $|3\rangle$ so that
$F_{12} = |\langle 1|2\rangle|^2$ and $F_{13} = |\langle 1|3\rangle|^2$. Next we adjust the phases of $|1\rangle$, $|2\rangle$, and $|3\rangle$ to satisfy
the first condition. Clearly, $F_{23} \ge \langle 2|3\rangle^2$.
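Let $|x\rangle = |2\rangle - |1\rangle$. Then

$$\langle x|x\rangle = \langle 1|1\rangle + \langle 2|2\rangle - 2\langle 1|2\rangle \le 2\left(1 - \sqrt{F_{12}}\right) ,$$

because $\rho_1$ and $\rho_2$ may be subnormalized. Furthermore,

$$\begin{aligned}
\sqrt{F_{13}} &= \langle 1|3\rangle \\
&= \langle 2|3\rangle - \langle x|3\rangle \\
&\le \sqrt{F_{23}} + |\langle x|3\rangle| \\
&\le \sqrt{F_{23}} + \sqrt{\langle x|x\rangle} \\
&\le \sqrt{F_{23}} + \sqrt{2\left(1 - \sqrt{F_{12}}\right)} .
\end{aligned}$$

Thus,

$$F_{13} \le F_{23} + 2\left(1 - \sqrt{F_{12}}\right) + 2\sqrt{2 F_{23} \left(1 - \sqrt{F_{12}}\right)} ,$$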



as we wished to prove. 
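A quick numerical sanity check of inequality (4) can be sketched as follows. It samples random normalized density matrices and uses the Uhlmann closed form for the fidelity, which is an assumption of this sketch, as are the helper names, the dimension 3, and the sample count. It is an illustration of the inequality, not a substitute for the proof.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def fidelity(rho1, rho2):
    """Assumed closed form: F = (Tr sqrt( sqrt(rho1) rho2 sqrt(rho1) ))^2."""
    s = psd_sqrt(rho1)
    return float(np.real(np.trace(psd_sqrt(s @ rho2 @ s))) ** 2)

def random_density(dim, rng):
    """A random full-rank density matrix of the given dimension."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
worst_gap = np.inf                        # smallest value of (right side - left side)
for _ in range(200):
    r1, r2, r3 = (random_density(3, rng) for _ in range(3))
    t = rng.uniform()
    r2 = t * r1 + (1 - t) * r2            # pull rho_2 toward rho_1 so F12 varies widely
    F12, F13, F23 = fidelity(r1, r2), fidelity(r1, r3), fidelity(r2, r3)
    slack = 1.0 - np.sqrt(min(F12, 1.0))
    rhs = F23 + 2 * slack + 2 * np.sqrt(2 * F23 * slack)
    worst_gap = min(worst_gap, rhs - F13)
```

If the inequality holds for every sample, `worst_gap` stays nonnegative up to floating-point round-off.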

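We note in passing that, if we relax the condition that $\mathrm{Tr}\, \rho_3 = 1$, we arrive at the more
general inequality for subnormalized states:

$$F_{13} \le F_{23} + 2\, \mathrm{Tr}\rho_3 \left(1 - \sqrt{F_{12}}\right) + 2\sqrt{2\, \mathrm{Tr}\rho_3\, F_{23} \left(1 - \sqrt{F_{12}}\right)} .$$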

III. CHANNEL SIZE AND FIDELITY 

The "size" of the channel system C is specified by the dimension $d$ of the Hilbert space
describing C. If C is composed of $M$ qubits, then $d = 2^M$. This means that in the process

$$\pi_i \rightarrow W_i \rightarrow w_i$$

the channel states $W_i$ are operators on a $d$-dimensional Hilbert space. For convenience,
we will imagine that the $W_i$ actually act on a $d$-dimensional subspace of the $n$-dimensional
Hilbert space describing the system Q. (We could always modify our decoding procedure
so that the channel states were first unitarily moved into the output system Q and then
subjected to a more general decoding process. The $W_i$ states would then be the unitary
images of the channel states in Q's Hilbert space.)

We are now ready to state our result. Imagine that an ensemble of pure states of Q (in
which the state $\pi_i$ appears with probability $p_i$) is described by a density operator $\rho = \sum_i p_i \pi_i$.
Let $\lambda_i$ be the eigenvalues of $\rho$, listed in descending order (so that $\lambda_1 \ge \ldots \ge \lambda_n$), and let
$|\lambda_i\rangle$ be associated eigenvectors.

Fidelity lemma: Suppose the dimension of the Hilbert space for the channel is
$d$, and write

$$\eta = \sum_{i=1}^{d} \lambda_i .$$

Then, for any encoding and decoding procedures, $F \le 6\eta$.
To prove this lemma, we first note that

$$d\, \lambda_{d+1} \le d\, \lambda_d \le \sum_{i=1}^{d} \lambda_i = \eta ,$$

so that $\lambda_{d+1} \le \eta/d$. Now we construct a projection operator

$$\Lambda = \sum_{i=d+1}^{n} |\lambda_i\rangle\langle\lambda_i| ,$$

which is the projection onto the subspace spanned by the eigenvectors corresponding to the
$n - d$ smallest eigenvalues of $\rho$. We use $\Lambda$ to project the input states $\pi_i$ into (subnormalized)
states $\tilde{\pi}_i$:

$$\tilde{\pi}_i = \Lambda \pi_i \Lambda , \qquad \tilde{\rho} = \sum_i p_i \tilde{\pi}_i = \Lambda \rho \Lambda .$$

The largest eigenvalue of $\tilde{\rho}$ is just $\lambda_{d+1}$.

Our plan is as follows. (For heuristic purposes and later application, we have in mind a
situation with $\eta$ small.) First, we will show that the original input states $\pi_i$ are, on average,
close to the projected states $\tilde{\pi}_i$. Then we will show that the average of $F(\tilde{\pi}_i, w_i)$ is small for
all possible coding/decoding schemes. Using the fidelity inequality in equation (4) above, we
will conclude that the average of $F(\pi_i, w_i)$ must therefore be small. The qualitative phrases
"close to" and "small" will be quantified by the value of $\eta$.

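Anticipating somewhat, we first find a lower bound for the average of the square root of
$F(\pi_i, \tilde{\pi}_i)$. Recall that $\pi_i = |a_i\rangle\langle a_i|$.

$$\begin{aligned}
\sum_i p_i \sqrt{F(\pi_i, \tilde{\pi}_i)} &= \sum_i p_i \sqrt{\mathrm{Tr}\, \pi_i \Lambda \pi_i \Lambda} \\
&= \sum_i p_i \sqrt{\langle a_i|\Lambda|a_i\rangle \langle a_i|\Lambda|a_i\rangle} \\
&= \sum_i p_i \langle a_i|\Lambda|a_i\rangle \\
&= \mathrm{Tr}\, \rho \Lambda \\
&= 1 - \eta . \qquad (5)
\end{aligned}$$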

We wish the decoding procedure to be as general as possible. Therefore we only require
that the procedure be specifiable independently of the state $W_i$ to which it is applied, and
that it is an allowable quantum dynamical evolution. The most general dynamical evolution
possible in quantum mechanics is a completely positive map on the space of density operators
[11]. Such a map can always be modeled by a unitary interaction between the system Q and
an ancilla system A (initially in some standard pure state $|\phi_0\rangle$), after which A is discarded.
We can therefore write

$$w_i = \mathrm{Tr}_A\, U \left( W_i \otimes |\phi_0\rangle\langle\phi_0| \right) U^\dagger \qquad (6)$$

for some unspecified unitary $U$.

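We can use this general form to find an upper bound for the average of $F(\tilde{\pi}_i, w_i)$. Note
that, although $\tilde{\pi}_i$ is subnormalized, it is still an operator of rank 1, and thus we can write
the fidelity as $\mathrm{Tr}\, \tilde{\pi}_i w_i$. Let $\Gamma_d$ be the projection onto the $d$-dimensional subspace occupied
by the channel states $W_i$. Then, writing the trace over the Q Hilbert space as $\mathrm{Tr}_Q$, etc.,

$$\begin{aligned}
\sum_i p_i F(\tilde{\pi}_i, w_i) &= \sum_i p_i \mathrm{Tr}_Q\, \tilde{\pi}_i \left( \mathrm{Tr}_A\, U (W_i \otimes |\phi_0\rangle\langle\phi_0|) U^\dagger \right) \\
&= \sum_i p_i \mathrm{Tr}_{QA}\, (\tilde{\pi}_i \otimes 1_A)\, U (W_i \otimes |\phi_0\rangle\langle\phi_0|) U^\dagger \\
&\le \sum_i p_i \mathrm{Tr}_{QA}\, (\tilde{\pi}_i \otimes 1_A)\, U (\Gamma_d \otimes |\phi_0\rangle\langle\phi_0|) U^\dagger \\
&= \mathrm{Tr}_{QA}\, (\tilde{\rho} \otimes 1_A)\, U (\Gamma_d \otimes |\phi_0\rangle\langle\phi_0|) U^\dagger .
\end{aligned}$$

The inequality in the third line holds because $W_i \le \Gamma_d$ as operators.
Now, every eigenvalue of $\tilde{\rho} \otimes 1_A$ is an eigenvalue of $\tilde{\rho}$. Furthermore, the operator
$U (\Gamma_d \otimes |\phi_0\rangle\langle\phi_0|) U^\dagger$ is a projection onto a $d$-dimensional subspace. The trace will therefore be less
than or equal to the sum of the $d$ largest eigenvalues of $\tilde{\rho} \otimes 1_A$, which in turn can be no
larger than $d \lambda_{d+1}$.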

$$\sum_i p_i \mathrm{Tr}_Q\, \tilde{\pi}_i w_i \le d \lambda_{d+1} \le d \left( \frac{\eta}{d} \right) = \eta . \qquad (7)$$
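The eigenvalue argument behind equation (7) can be probed numerically. The sketch below builds a random density operator as a stand-in for $\rho$, forms $\tilde{\rho} = \Lambda\rho\Lambda$, and evaluates the trace against the rotated rank-$d$ projector for many random unitaries. The dimensions (`n = 4`, `d = 2`, a two-level ancilla) and the QR-based random unitaries are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, dim_a = 4, 2, 2                     # dim of Q, channel dimension, ancilla dimension

# Random input density operator rho on Q (a stand-in for sum_i p_i pi_i).
g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
rho = g @ g.conj().T
rho /= np.trace(rho).real

vals, vecs = np.linalg.eigh(rho)          # eigenvalues in ascending order
low = vecs[:, : n - d]                    # eigenvectors of the n-d smallest eigenvalues
Lam = low @ low.conj().T                  # projector Lambda
rho_t = Lam @ rho @ Lam                   # projected operator rho-tilde
lam_d_plus_1 = vals[n - d - 1]            # largest eigenvalue of rho-tilde

# Rank-d projector Gamma_d (x) |phi_0><phi_0| on Q (x) A.
Gamma = np.diag([1.0] * d + [0.0] * (n - d))
phi0 = np.zeros((dim_a, dim_a))
phi0[0, 0] = 1.0
P = np.kron(Gamma, phi0)
big_rho = np.kron(rho_t, np.eye(dim_a))   # rho-tilde (x) 1_A

worst = 0.0                               # largest trace value seen over all samples
for _ in range(100):
    z = rng.normal(size=(n * dim_a, n * dim_a)) + 1j * rng.normal(size=(n * dim_a, n * dim_a))
    U, _ = np.linalg.qr(z)                # a random unitary on Q (x) A
    value = np.trace(big_rho @ U @ P @ U.conj().T).real
    worst = max(worst, value)
```

Across all sampled unitaries the trace never exceeds $d\lambda_{d+1}$, in line with the bound used in the text.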



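We now find an upper bound for $F$ by applying the fidelity inequality in equation (4) to
each term in the average:

$$F(\pi_i, w_i) \le \underbrace{F(\tilde{\pi}_i, w_i)}_{X_i} + \underbrace{2\left(1 - \sqrt{F(\pi_i, \tilde{\pi}_i)}\right)}_{Y_i} + \underbrace{2\sqrt{2 F(\tilde{\pi}_i, w_i) \left[1 - \sqrt{F(\pi_i, \tilde{\pi}_i)}\right]}}_{Z_i} .$$

We will bound the averages $X$, $Y$, and $Z$ separately.
We have already bounded $X$ in equation (7):

$$X = \sum_i p_i X_i = \sum_i p_i \mathrm{Tr}_Q\, \tilde{\pi}_i w_i \le \eta .$$

Similarly, the bound for $Y$ follows from equation (5):

$$Y = \sum_i p_i Y_i = 2\left(1 - \sum_i p_i \sqrt{F(\pi_i, \tilde{\pi}_i)}\right) = 2\eta .$$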
To find an upper bound for $Z$, we use these two results together with the Schwarz inequality:

$$\begin{aligned}
Z &= \sum_i p_i Z_i \\
&= 2 \sum_i p_i \sqrt{X_i Y_i} \\
&\le 2 \sqrt{\sum_i p_i X_i}\, \sqrt{\sum_j p_j Y_j} \\
&\le 2\sqrt{2}\, \eta .
\end{aligned}$$

Therefore,

$$F \le X + Y + Z \le \eta + 2\eta + 2\sqrt{2}\, \eta < 6\eta ,$$

which is what we wished to establish. 
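The whole chain of bounds can be exercised on a small random instance. The following sketch is only an illustration under stated assumptions: the dimensions, the random ensemble, the choice of a random pure channel state for each signal (an "arbitrary encoding"), and the single random unitary used for decoding via equation (6) are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, dim_a, k = 4, 2, 2, 6               # dim Q, channel dim, ancilla dim, number of signals

def random_unitary(m):
    z = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    q, _ = np.linalg.qr(z)
    return q

probs = rng.dirichlet(np.ones(k))
states = rng.normal(size=(k, n)) + 1j * rng.normal(size=(k, n))
states /= np.linalg.norm(states, axis=1, keepdims=True)
rho = sum(p * np.outer(a, a.conj()) for p, a in zip(probs, states))

vals, vecs = np.linalg.eigh(rho)          # ascending order
eta = float(np.sum(vals[n - d:]))         # sum of the d largest eigenvalues
low = vecs[:, : n - d]
Lam = low @ low.conj().T                  # projector Lambda

U = random_unitary(n * dim_a)             # one fixed decoding unitary on Q (x) A
phi0 = np.zeros((dim_a, dim_a))
phi0[0, 0] = 1.0

def decode(W):
    """w = Tr_A U (W (x) |phi_0><phi_0|) U^dagger, as in equation (6)."""
    big = U @ np.kron(W, phi0) @ U.conj().T
    return np.trace(big.reshape(n, dim_a, n, dim_a), axis1=1, axis2=3)

X = Y = Z = F = 0.0
for p, a in zip(probs, states):
    # Arbitrary encoding: a random pure channel state in the first d dimensions.
    c = np.zeros(n, dtype=complex)
    c[:d] = rng.normal(size=d) + 1j * rng.normal(size=d)
    c /= np.linalg.norm(c)
    w = decode(np.outer(c, c.conj()))
    la = Lam @ a
    s = np.vdot(a, la).real               # sqrt F(pi_i, pi~_i) = <a_i|Lambda|a_i>
    Xi = np.vdot(la, w @ la).real         # F(pi~_i, w_i) = Tr pi~_i w_i
    Yi = 2 * (1 - s)
    X += p * Xi
    Y += p * Yi
    Z += p * 2 * np.sqrt(max(Xi * Yi, 0.0))
    F += p * np.vdot(a, w @ a).real       # actual average fidelity
```

In this instance the averages respect $X \le \eta$, $Y = 2\eta$, $Z \le 2\sqrt{2}\,\eta$, and the true average fidelity sits below $X + Y + Z$, which in turn is below $6\eta$.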

We point out once again that no assumption has been made about the encoding procedure 
$\pi_i \rightarrow W_i$. This may be completely arbitrary. We do not require that it be accomplished by
a process that is "blind" to the input state $\pi_i$, that is, by a completely positive map. This
means that we are allowing Alice to be completely cognizant of the identity of the input she 
is representing in the channel, even though it may be one of a nonorthogonal (and hence 
imperfectly distinguishable) set. 

We note finally that the bound $F \le 6\eta$ is quite likely to be loose. For example, in [2]
and [3], where the decoding scheme was assumed to be unitary, a bound of $F \le \eta$ was
derived. This bound for unitary decoding is achieved by a very natural coding/decoding
scheme: $W_i$ is the renormalized projection of $\pi_i$ into the subspace corresponding to $\rho$'s
largest $d$ eigenvalues and the unitary decoding is just the identity. Denoting the projector
onto this subspace by $\Gamma_d$, the fidelity may be written (taking the sum to exclude $i$ such that
$\pi_i$ are orthogonal to $\Gamma_d$, which make zero contribution to average fidelity however they are
encoded):

$$\begin{aligned}
F &= \sum_i p_i\, \mathrm{Tr}\, \pi_i \frac{\Gamma_d \pi_i \Gamma_d}{\mathrm{Tr}\, \pi_i \Gamma_d} \\
&= \sum_i p_i \frac{\langle a_i|\Gamma_d|a_i\rangle \langle a_i|\Gamma_d|a_i\rangle}{\langle a_i|\Gamma_d|a_i\rangle} \\
&= \sum_i p_i \langle a_i|\Gamma_d|a_i\rangle \\
&= \mathrm{Tr}\, \rho \Gamma_d \\
&= \eta .
\end{aligned}$$
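The projection scheme just described is easy to simulate. The sketch below draws a random ensemble of pure states (the dimensions and the ensemble are assumptions for illustration), encodes each signal as its renormalized projection onto the top-$d$ eigenspace of $\rho$, decodes with the identity, and compares the resulting average fidelity with $\eta$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 5, 2, 8                         # dim of Q, channel dimension, number of signals

probs = rng.dirichlet(np.ones(k))
states = rng.normal(size=(k, n)) + 1j * rng.normal(size=(k, n))
states /= np.linalg.norm(states, axis=1, keepdims=True)

rho = sum(p * np.outer(a, a.conj()) for p, a in zip(probs, states))
vals, vecs = np.linalg.eigh(rho)          # ascending order
top = vecs[:, n - d:]                     # eigenvectors of the d largest eigenvalues
Gamma = top @ top.conj().T                # projector Gamma_d
eta = float(np.sum(vals[n - d:]))

F = 0.0
for p, a in zip(probs, states):
    proj = Gamma @ a                      # unnormalized projection of |a_i>
    weight = np.vdot(proj, proj).real     # <a_i|Gamma_d|a_i>
    if weight > 1e-12:                    # signals orthogonal to Gamma_d contribute zero
        W = np.outer(proj, proj.conj()) / weight   # channel state W_i
        F += p * np.vdot(a, W @ a).real            # identity decoding: w_i = W_i
```

Up to round-off, `F` coincides with `eta`, so this simple scheme achieves $F = \eta$ in the example and lies comfortably inside the general bound $6\eta$.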

Nevertheless the bound of $6\eta$ suffices for proving the converse of the quantum noiseless
coding theorem. 



IV. QUANTUM CODING 

Suppose the input state $\pi_i$ of Q occurs with probability $p_i$, so that the ensemble of
inputs is described by $\rho = \sum_i p_i \pi_i$, as above. Further suppose that a long sequence of $N$
such inputs, generated independently, is available. The ensemble of $N$-sequences of input
states is then described by

$$\rho^N = \underbrace{\rho \otimes \cdots \otimes \rho}_{N} .$$

For sufficiently large $N$, the structure of $\rho^N$ is characterized by a typical subspace $T_N$.

The typical subspace may be described as follows. Fix $\epsilon, \delta > 0$. Then for sufficiently
large $N$, there exists a subspace $T_N$ spanned by eigenstates of $\rho^N$ such that

• If $\Pi$ is the projection onto $T_N$, then

$$\mathrm{Tr}\, \Pi \rho^N \Pi > 1 - \epsilon .$$

• If $|\lambda\rangle$ is an eigenstate of $\rho^N$ with eigenvalue $\lambda$, and $|\lambda\rangle \in T_N$, then

$$2^{-N(S(\rho)+\delta)} < \lambda < 2^{-N(S(\rho)-\delta)} .$$



Now suppose that a sequence of $N$ inputs is encoded somehow into a set of qubits,
so that $S(\rho) - 2\delta$ qubits are used per input. The Hilbert space describing the channel of
$N(S(\rho) - 2\delta)$ qubits will have dimension $d = 2^{N(S(\rho) - 2\delta)}$. The channel states are used in
some decoding procedure to produce an output state of $N$ copies of Q.

According to our fidelity lemma, we can bound the fidelity of this process by calculating
the sum of the largest $d$ eigenvalues of $\rho^N$. We will denote this by $\Sigma_d$. This sum must
certainly be smaller than the sum of all of the eigenvalues outside the typical subspace $T_N$
plus $d$ times the largest eigenvalue inside $T_N$. That is,

$$\begin{aligned}
\Sigma_d &< \epsilon + d\, 2^{-N(S(\rho)-\delta)} \\
&= \epsilon + 2^{N(S(\rho)-2\delta)}\, 2^{-N(S(\rho)-\delta)} \\
&= \epsilon + 2^{-N\delta} .
\end{aligned}$$

For sufficiently large $N$, $\Sigma_d < 2\epsilon$. Thus, by our fidelity lemma, $F < 12\epsilon$. Letting $\delta = \delta'/2$
and $\epsilon = \epsilon'/12$, we find that if $S(\rho) - \delta'$ qubits are available per input, then for sufficiently
large $N$ the average fidelity $F < \epsilon'$. This establishes the converse to the quantum noiseless
coding theorem for the most general sort of coding and decoding schemes.
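The decay of the sum of the $d$ largest eigenvalues with block length can be seen in a simple worked case. The sketch below assumes a qubit source with $\rho = \mathrm{diag}(q, 1-q)$, for which $\rho^N$ has eigenvalues $q^{N-k}(1-q)^k$ with multiplicity $\binom{N}{k}$; the values `q = 0.9` and `delta = 0.05`, the block lengths, and the function name are illustrative assumptions. The channel dimension is rounded down to a power of two with integer exponent.

```python
import math

def top_eigenvalue_sum(q, N, d):
    """Sum of the d largest eigenvalues of rho^{(x)N} for rho = diag(q, 1-q), q >= 1/2."""
    total, remaining = 0.0, d
    for k in range(N + 1):                # k = number of low-probability factors
        if remaining <= 0:
            break
        take = min(math.comb(N, k), remaining)
        # Work in log2 so that huge multiplicities and tiny eigenvalues do not overflow.
        log2_term = math.log2(take) + (N - k) * math.log2(q) + k * math.log2(1 - q)
        total += 2.0 ** log2_term
        remaining -= take
    return total

q, delta = 0.9, 0.05
S = -(q * math.log2(q) + (1 - q) * math.log2(1 - q))   # S(rho) for this qubit source

sums = []
for N in (100, 400, 1600):
    d = 2 ** int(N * (S - 2 * delta))     # channel dimension for S - 2*delta qubits per input
    sums.append(top_eigenvalue_sum(q, N, d))
```

The entries of `sums` shrink as `N` grows, so by the fidelity lemma the achievable average fidelity at this sub-entropy rate is forced toward zero for long blocks.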

V. APPENDIX 

We demonstrate here by explicit example that decoding schemes more general than the 
set of unitary ones can be of some benefit in situations of nonoptimal coding. 

Consider three signal states $|a_0\rangle$, $|a_1\rangle$, and $|a_2\rangle$ which are all real positive linear
combinations of three fixed orthonormal vectors, so that we may picture them as vectors in the
positive octant of $\mathbb{R}^3$. The states form three edges of a regular tetrahedron with the origin
as their common vertex, and thus are all 60° apart. The states $|a_0\rangle$ and $|a_1\rangle$, in particular,
are assumed to be in the positive quadrant of the x-y plane, each vector having an angle
of 15° between itself and the nearest axis. The prior probabilities for the signal states are
.49, .49, and .02, respectively. The encoding scheme associates the orthogonal projectors $W_0$
and $W_1$ onto the x and y axes, respectively, with the states $|a_0\rangle$ and $|a_1\rangle$. It associates the
density matrix

$$W_2 = \frac{1}{2} |a_0\rangle\langle a_0| + \frac{1}{2} |a_1\rangle\langle a_1| ,$$

corresponding to an equal mixture of $|a_0\rangle$ and $|a_1\rangle$, with the state $|a_2\rangle$. Note that the set of
encoded states has a two-dimensional support, i.e., a support smaller than that containing
the signal states.

Because the signal state $|a_2\rangle$ has such a small prior probability, the symmetry of this
encoding should make it clear that the best unitary decoding scheme will be only slightly
different from not decoding at all. (Actually, detailed calculation demonstrates that the
optimal unitary decoding is to rotate the encoded states by 0.791° toward $|a_2\rangle$, but this only
changes the average fidelity in the fourth significant figure.) Making this approximation, the
average fidelity for this decoding scheme is

$$F \approx 2 \times .49 \times \cos^2 15^\circ + .02 \times \cos^2 60^\circ = .919 .$$

However there exists a simple nonunitary decoding scheme that achieves a better fidelity 
than this. Since some of the signals are encoded in orthogonal alternatives, it is plausible 
that a decoding device can use a measurement to gather information about the signal and 
use that information to produce decoded states that are closer, on average, to the originals. 
In particular, the decoding device can do the following. It first measures the observable 
corresponding to the x and y axes. If the outcome is x, it outputs the state $w_0 = \pi_0$; if the
outcome is y, it outputs the state $w_1 = \pi_1$. Thus in the cases that Q was actually prepared
in $|a_0\rangle$ or $|a_1\rangle$, the transmissions will have perfect fidelity. In the case that $|a_2\rangle$ was the
actual signal state, the fidelity of the transmission will still be $\cos^2 60^\circ = .25$. Therefore the
average fidelity for this nonunitary decoding scheme is F = .985, and this certainly beats 
the unitary scheme. 
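The two fidelities quoted in this example can be reproduced with a few lines of NumPy. The sketch below constructs the three signal vectors from the stated angles, uses "no decoding" as the approximation to the best unitary scheme (as the text does), and models the measurement-based decoder as a measure-and-prepare map. The variable and function names are assumptions of this sketch.

```python
import numpy as np

deg = np.pi / 180
a0 = np.array([np.cos(15 * deg), np.sin(15 * deg), 0.0])
a1 = np.array([np.sin(15 * deg), np.cos(15 * deg), 0.0])
x = 0.5 / (np.cos(15 * deg) + np.sin(15 * deg))   # makes a2 sit 60 degrees from a0 and a1
a2 = np.array([x, x, np.sqrt(1 - 2 * x**2)])
signals, probs = [a0, a1, a2], [0.49, 0.49, 0.02]

def proj(v):
    """Projector |v><v| for a real unit vector v."""
    return np.outer(v, v)

ex, ey = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
W = [proj(ex), proj(ey), 0.5 * proj(a0) + 0.5 * proj(a1)]   # encoded channel states

def avg_fidelity(outputs):
    return float(sum(p * (a @ w @ a) for p, a, w in zip(probs, signals, outputs)))

# "No decoding": the output is simply the channel state.
F_identity = avg_fidelity(W)

# Measure-and-prepare decoding: outcome x gives pi_0, outcome y gives pi_1.
def measure_and_prepare(Wi):
    return (ex @ Wi @ ex) * proj(a0) + (ey @ Wi @ ey) * proj(a1)

F_measure = avg_fidelity([measure_and_prepare(Wi) for Wi in W])
```

The first number rounds to .919 and the second equals .985, matching the values in the text and showing the gain from the nonunitary decoder for this particular nonoptimal encoding.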

This simple example demonstrates that in some cases involving particular nonoptimal 
encoding schemes, it is possible for nonunitary decoding to increase the fidelity of a quantum 
channel. Nevertheless the converse of the quantum noiseless coding theorem implies that nonunitary
decodings provide no asymptotic advantage over unitary decoding schemes in the problem
of minimizing channel resources over all possible coding/decoding schemes.